The performance of EuroSCORE II in CABG patients in relation to sex, age, and surgical risk: a nationwide study in 14,118 patients

Background: To determine the discriminative accuracy and calibration of EuroSCORE II in relation to age, sex, and surgical risk in a large nationwide coronary artery bypass grafting (CABG) cohort.

Methods: All 14,118 patients undergoing isolated CABG in Sweden during 2012–2017 were included. Individual patient data were taken from the SWEDEHEART registry. Patients were divided by age (< 60, 60–69, 70–79, ≥ 80 years), sex, and surgical risk (low: EuroSCORE II < 4%, intermediate: 4–8%, high: > 8%). Discriminative accuracy was determined by the area under the receiver operating characteristic curve (AUC) and calibration by the observed/estimated (O/E) mortality ratio at 30 days.

Results: AUC and O/E ratio were 0.82 (95% CI 0.79–0.85) and 0.58 (0.50–0.66) overall, 0.82 (0.79–0.86) and 0.57 (0.48–0.66) in men, and 0.79 (0.73–0.85) and 0.60 (0.47–0.75) in women. Regarding age, discriminative accuracy was highest in patients aged 60–69 years (AUC: 0.86 [0.80–0.93]) but was satisfactory in the other age groups (AUC: 0.74–0.80). O/E ratio varied from 0.26 for patients < 60 years to 0.90 for patients ≥ 80 years. Regarding surgical risk, AUC and O/E ratio were 0.63 (0.44–0.83) and 0.18 (0.09–0.30) in low-risk patients, 0.60 (0.55–0.66) and 0.57 (0.46–0.68) in intermediate-risk patients, and 0.78 (0.73–0.83) and 0.78 (0.64–0.92) in high-risk patients.

Conclusions: EuroSCORE II had good discriminative accuracy independently of sex and age, but markedly overestimated mortality risk, especially in younger patients. Accuracy and calibration were better in high-risk patients than in low-risk and intermediate-risk patients.

Supplementary Information: The online version contains supplementary material available at 10.1186/s13019-023-02141-4.


Introduction
Several risk stratification models based on patient characteristics, comorbidities, and type of surgical procedure have been developed to estimate the mortality risk after cardiac surgery [1]. The European System for Cardiac Operative Risk Evaluation (EuroSCORE), first introduced in 1999, was designed to improve patient selection and became widely adopted [2]. However, as perioperative and postoperative care improved, the discriminative accuracy and calibration of EuroSCORE I decreased. A new version, EuroSCORE II, which outperforms EuroSCORE I for risk stratification, was therefore introduced in 2011 [3]. Today, EuroSCORE II and the Society of Thoracic Surgeons Predicted Risk of Mortality (STS-PROM) are the most widely recognized and utilized risk stratification tools [4,5]. EuroSCORE II and STS-PROM have comparable discriminative accuracy and calibration regarding in-hospital and 30-day mortality in coronary artery bypass grafting (CABG) and in aortic valve replacement (AVR) patients [5][6][7][8].
Previous analyses of cardiac surgery risk scores, including EuroSCORE II, have noted that the scores overestimate the risk of death after CABG in octogenarians [9][10][11][12][13][14] and in high-risk patients [15,16]. However, most of these studies were performed in single-centre cohorts of limited size, and large contemporary population-based studies are lacking. In the present study, we hypothesized on the basis of these previous studies that EuroSCORE II would perform less well in octogenarians and in high-risk patients. To test this hypothesis, we assessed the predictive accuracy and calibration of EuroSCORE II in different age groups, in men and women, and in patients with low, intermediate, and high surgical risk, using a large nationwide cohort of CABG patients.

Study population
All consecutive patients > 18 years of age who underwent first-time isolated CABG in Sweden between 1 January 2012 and 30 November 2017 were identified in the Swedish Cardiac Surgery Registry [17], which is part of the Swedish Web System for Enhancement and Development of Evidence-Based Care in Heart Disease Evaluated According to Recommended Therapies registry (SWEDEHEART) [18]. All patients were followed up until death, emigration, or 31 December 2017, whichever occurred first. The study cohort was divided into groups based on age at the time of CABG (< 60, 60-69, 70-79, and ≥ 80 years), sex, and surgical risk according to EuroSCORE II, with low risk defined as EuroSCORE II < 4%, intermediate risk as 4-8%, and high risk as > 8%.
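The grouping described above can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function names `age_group` and `risk_group` are hypothetical, and the handling of the exact boundary values (a score of exactly 4% or 8% counted as intermediate, age treated as a continuous value in years) is an assumption based on the stated cut-offs.

```python
def age_group(age_years: float) -> str:
    """Return the age stratum used in the study for an age in years."""
    if age_years < 60:
        return "<60"
    if age_years < 70:
        return "60-69"
    if age_years < 80:
        return "70-79"
    return ">=80"


def risk_group(euroscore2_percent: float) -> str:
    """Return the surgical-risk stratum for a EuroSCORE II value in percent.

    Assumption: values of exactly 4% and 8% fall in the intermediate group,
    following the stated definition (low < 4%, intermediate 4-8%, high > 8%).
    """
    if euroscore2_percent < 4.0:
        return "low"
    if euroscore2_percent <= 8.0:
        return "intermediate"
    return "high"
```

For example, a 72-year-old patient with a EuroSCORE II of 5.3% would be assigned to the 70-79 age group and the intermediate-risk group.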

Data sources
Individual patient data from two nationwide registries were merged on the basis of the personal identification number, which all Swedish residents are given at birth or shortly after immigration [19]. Operative details and patient characteristics, including EuroSCORE II, were extracted from the Swedish Cardiac Surgery Registry, which has prospectively collected detailed information, including risk stratification, on all cardiac surgery patients and operations performed in Sweden since 1992 and has a coverage of 98-99% [17]. Mortality was extracted from the Cause of Death Register, which has collected information on date and cause of death based on ICD codes since 1961 [20].

EuroSCORE II
EuroSCORE II estimates the 30-day mortality risk after cardiac surgery, expressed as a percentage. The variables included in EuroSCORE II are age, sex, presence of renal impairment, extracardiac arteriopathy, poor mobility, previous cardiac surgery, chronic pulmonary disease, active endocarditis, critical preoperative state, insulin-treated diabetes mellitus, New York Heart Association (NYHA) class of heart failure, unstable angina defined as Canadian Cardiovascular Society (CCS) class 4 angina, left ventricular ejection fraction (LVEF; > 50%, 30-50%, 20-30%, < 20%), recent myocardial infarction (within 90 days), pulmonary hypertension, urgency of the procedure, weight of the intervention, and surgery on the thoracic aorta [3].
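For orientation, EuroSCORE II is a logistic regression model, so the variables above enter the predicted risk through a linear predictor. The general form is sketched below; the symbols are generic notation introduced here for illustration, and the fitted intercept and coefficients are those published with the model [3], which are not reproduced here.

```latex
% General logistic form of the predicted mortality \hat{p}.
% \beta_0 is the model intercept, \beta_i the coefficient of risk factor i,
% and x_i the value of risk factor i for the patient (1/0 for categorical factors).
\[
  \hat{p} \;=\; \frac{e^{y}}{1 + e^{y}},
  \qquad
  y \;=\; \beta_0 + \sum_{i} \beta_i x_i
\]
```

The percentage reported as the EuroSCORE II value corresponds to 100 times this predicted probability.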

Outcome
The outcome was all-cause mortality defined as any death occurring between the start of surgery and 30 days after isolated CABG. The expected 30-day mortality, based on the calculated EuroSCORE II for each patient, was compared with the observed 30-day mortality.
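A minimal sketch of how the 30-day outcome could be derived from registry dates is given below. It is not the authors' code: the function name `died_within_30_days` is hypothetical, and it assumes that dates are available at day resolution and that a death on day 30 after surgery counts as an event.

```python
from datetime import date
from typing import Optional


def died_within_30_days(surgery_date: date, death_date: Optional[date]) -> bool:
    """Return True if death occurred from the day of surgery up to day 30.

    death_date is None for patients with no recorded death.
    Assumption: day 30 itself is included in the 30-day window.
    """
    if death_date is None:
        return False
    days_after_surgery = (death_date - surgery_date).days
    return 0 <= days_after_surgery <= 30
```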

Statistical analysis
Continuous variables were described as means and standard deviations and categorical variables as numbers and percentages. The discriminative accuracy was calculated with c-statistics [21] from a logistic regression and reported as the area under the receiver operating characteristic curve (AUC) with 95% confidence intervals (CIs), both for all patients and stratified by age group, risk group, and sex. Receiver operating characteristic curves and AUC were used to analyse the sensitivity and specificity of expected versus observed mortality within 30 days after surgery. The observed 30-day all-cause mortality was compared with the expected 30-day mortality, based on the calculated EuroSCORE II for each patient. The comparison was achieved by calculating the ratio of observed versus estimated mortality (O/E ratio) for all patients and for the respective groups. In addition, 95% CIs were constructed for the ratios with the bootstrap percentile method using 1000 bootstrapped samples. All tests were two-sided and conducted at the 5% significance level. All statistical analyses were performed using version 9.4 of SAS (Cary, NC).
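The two performance measures can be illustrated with a short Python sketch. The analyses in this study were run in SAS, so the code below is not the authors' implementation: the function names `auc`, `oe_ratio`, and `bootstrap_ci` are hypothetical, the AUC is computed directly from the scores as a pairwise concordance rather than from a fitted logistic regression (which should give the same value when the score is the only predictor and is positively associated with death), and the percentile indices used for the interval are one common convention.

```python
import random
from typing import Callable, List, Sequence, Tuple


def auc(scores: Sequence[float], died: Sequence[int]) -> float:
    """C-statistic: probability that a randomly chosen patient who died has a
    higher predicted risk than a randomly chosen survivor (ties count 0.5)."""
    pos = [s for s, d in zip(scores, died) if d == 1]
    neg = [s for s, d in zip(scores, died) if d == 0]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def oe_ratio(scores_percent: Sequence[float], died: Sequence[int]) -> float:
    """Observed deaths divided by the sum of predicted risks (scores in %)."""
    observed = sum(died)
    expected = sum(s / 100.0 for s in scores_percent)
    return observed / expected


def bootstrap_ci(
    scores: Sequence[float],
    died: Sequence[int],
    stat: Callable[[Sequence[float], Sequence[int]], float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap CI: resample patients with replacement, recompute
    the statistic, and take the alpha/2 and 1 - alpha/2 percentiles."""
    rng = random.Random(seed)
    n = len(scores)
    estimates: List[float] = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(stat([scores[i] for i in idx], [died[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

An O/E ratio below 1 means that fewer deaths were observed than the score predicted, that is, the score overestimated mortality. Calling `bootstrap_ci(scores, died, oe_ratio)` on a stratum's data would give a percentile interval in the manner described above.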

General
The study population consisted of 14,118 consecutive CABG patients. Their mean age was 68.5 years, and 18.3% were women. Baseline characteristics for patients are presented according to age in Table 1, and according to sex and surgical risk in Additional file 1: Tables S1 and S2. The proportions of comorbidities increased by age group except for diabetes, which was less common in the more elderly patients (Table 1). Baseline characteristics by risk group are given in Additional file 1: Table S2.

Mortality
Overall, the observed 30-day mortality was 1.5% for all patients, 1.3% in men, and 2.3% in women (Table 2).

Performance of the EuroSCORE II model in CABG patients by age group
The overall discriminative accuracy of EuroSCORE II in the study population was good (AUC: 0.82; 95% CI 0.79-0.85; Fig. 1A).

Performance of the EuroSCORE II model in CABG patients by surgical risk group
The discriminative accuracy of EuroSCORE II was best among patients with high surgical risk (AUC: 0.78, 95% CI 0.73-0.83; Fig. 1C, Fig. 2B).

Discussion
In this population-based study, we investigated the discriminative accuracy and calibration of the EuroSCORE II risk stratification tool in a large nationwide cohort of CABG patients. The main findings were as follows. Firstly, EuroSCORE II had good discriminative accuracy independently of sex and age, but markedly overestimated the mortality risk, especially in younger patients. Secondly, the discriminative accuracy and calibration were better in high-risk patients than in low-risk and intermediate-risk patients.
Risk stratification tools are used to determine the mortality risk in individual patients, but can also be used to facilitate operation program planning by optimizing patient mix, for quality assessment, and in benchmarking for comparisons between centres and surgeons. To achieve this, the tool needs to have high discriminative accuracy. The present study showed that EuroSCORE II had good discriminative accuracy when applied to a nationwide CABG cohort, and that the accuracy was mainly independent of age and sex. The overall AUC in the present study (0.82) was comparable to the accuracy achieved in the original validation data set of EuroSCORE II [3]. The acceptable overall discriminative accuracy of EuroSCORE II has been confirmed in several studies in different cardiac surgery populations as well as in meta-analyses [6,7], showing an AUC of 0.77-0.81. The present study showed that the best discriminative accuracy was detected in patients aged 60-69 years, and that this accuracy decreased somewhat with increasing age. These results are in accordance with those of Poullis et al., who suggested that the EuroSCORE II tool should be used with caution in patients > 70 years old [12]. The present finding of highest accuracy in patients aged 60-69 years can likely be explained by overrepresentation of patients of this age in the original EuroSCORE II dataset that was used to develop the score [3]. Despite the good discriminative accuracy of EuroSCORE II, the results from the present study showed a marked overestimation of mortality in our CABG population, with an overall O/E ratio of 0.57. In comparison, a meta-analysis based on 22 studies in 145,592 mixed cardiac surgery patients reported an O/E ratio of 1.02 [6], while a large study in 16,096 CABG patients found an O/E ratio of 0.72 [16].
The present study does not give any clear explanation for the lower observed mortality in our study, though it may be at least partly due to improved intraoperative and postoperative care in this more contemporary study population. Nevertheless, the results imply that EuroSCORE II needs to be calibrated for different populations and/or procedures.
We observed the lowest O/E ratio in younger patients, with a value of 0.26 for patients < 60 years and 0.42 for patients aged 60-69 years, while the ratio was 0.89 in patients ≥ 80 years. This was a surprising result, given that some smaller studies have indicated that EuroSCORE II overestimates mortality in octogenarians [9,10,13]. Hence, our hypothesis that EuroSCORE II would perform less well in octogenarians could not be confirmed, since the discriminative accuracy was only somewhat lower in older patients and the calibration was better. We also hypothesized that EuroSCORE II would perform less well in patients with high surgical risk. This hypothesis was based on a study by Howell et al. [15] which showed low discriminative accuracy in high-risk patients (AUC: 0.65), and another study by Osnabrugge et al. [16] demonstrating a low O/E ratio (0.51) in high-risk aortic valve replacement patients. The results of the present study did not support our hypothesis, since both the discriminative accuracy and the calibration were better in high-risk than in low-risk and intermediate-risk patients.
The present study has both strengths and limitations. Strengths include the large nationwide study cohort, which is by far the largest yet used to examine the performance of EuroSCORE II in relation to all three of age, sex, and surgical risk. Limitations include the definition of high, intermediate, and low surgical risk, which was adapted from the EACTS/ESC guideline definition in aortic valve replacement patients [22]. A consensus definition in CABG patients is lacking.

Conclusion
EuroSCORE II showed a satisfactory discriminative accuracy when applied in a large cohort of CABG patients. However, it markedly overestimated the mortality risk in this study cohort, especially in younger patients. This poor calibration strongly suggests that it is necessary to calibrate EuroSCORE II for different study populations.

Abbreviations
AUC: Area under the receiver operating characteristic curve
BMI: Body mass index
CABG: Coronary artery bypass grafting
CCS: Canadian Cardiovascular Society functional classification of angina
CI: Confidence interval
EuroSCORE: The European System for Cardiac Operative Risk Evaluation
LVEF: Left ventricular ejection fraction
NYHA: New York Heart Association class of heart failure
O/E: Observed/expected ratio
SWEDEHEART: Swedish Web System for Enhancement and Development of Evidence-Based Care in Heart Disease Evaluated According to Recommended Therapies

Author contributions
[...] analysis of data, and writing of the report. MS was involved in the analysis of data and writing of the report. All authors read and approved the final manuscript.

Funding
Open access funding provided by University of Gothenburg. The study was supported by the Swedish state under the ALF agreement between the Swedish government and the county councils concerning economic support of research and education of doctors (Grant No. ALFGBG-942665 to SN), the Local Research and Development Council Skaraborg (VGFOUSKB-964094 to MK), Skaraborg Hospital Research Fund (VGFOUSKAS-936367 to MK, VGFOUSKAS-963405 to MK), and Västra Götaland Regional Research Fund (VGFOUREG-940375 to MK).

Availability of data and materials
The data underlying this article will be shared on reasonable request to the corresponding author.

Declarations

Ethics approval and consent to participate
The study was approved by the Regional Ethics Board of Gothenburg (approval number: 139-16), which waived the need for individual patient consent due to the retrospective registry-based study design.